SHAIL: Safety-Aware Hierarchical Adversarial Imitation Learning for Autonomous Driving in Urban Environments
Designing a safe and human-like decision-making system for an autonomous
vehicle is a challenging task. Generative imitation learning is one possible
approach for automating policy-building by leveraging both real-world and
simulated decisions. Previous work that applies generative imitation learning
to autonomous driving policies focuses on learning a low-level controller for
simple settings. However, to scale to complex settings, many autonomous driving
systems combine fixed, safe, optimization-based low-level controllers with
high-level decision-making logic that selects the appropriate task and
associated controller. In this paper, we attempt to bridge this gap in
complexity by employing Safety-Aware Hierarchical Adversarial Imitation
Learning (SHAIL), a method for learning a high-level policy that selects from a
set of low-level controller instances in a way that imitates low-level driving
data on-policy. We introduce an urban roundabout simulator that controls
non-ego vehicles using real data from the Interaction dataset. We then show
empirically that our approach produces better behavior than previous
driver-imitation approaches, which have difficulty scaling to complex
environments. Our implementation is available at
https://github.com/sisl/InteractionImitation
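The hierarchical structure described above can be sketched as follows. This is a minimal illustration, not the SHAIL implementation: the `Controller` and `HighLevelPolicy` classes, the scalar state, and the selection rule are all hypothetical stand-ins for the fixed low-level controllers and the learned high-level policy the abstract describes.

```python
class Controller:
    """A fixed, safe low-level controller (e.g. brake, cruise) mapping state to action."""
    def __init__(self, name, gain):
        self.name = name
        self.gain = gain

    def act(self, state):
        # Placeholder control law: proportional feedback toward zero.
        return -self.gain * state


class HighLevelPolicy:
    """Selects which low-level controller instance to run at each step."""
    def __init__(self, controllers):
        self.controllers = controllers

    def select(self, state):
        # Placeholder selection rule; in SHAIL this mapping is learned
        # adversarially so that the resulting low-level trajectories
        # imitate real driving data on-policy.
        return 0 if abs(state) > 1.0 else 1


def step(policy, state):
    idx = policy.select(state)
    controller = policy.controllers[idx]
    return controller.act(state), controller.name


policy = HighLevelPolicy([Controller("brake", 0.9), Controller("cruise", 0.1)])
action, chosen = step(policy, 2.0)
```

The key design point is that only the discrete selection is learned; the continuous control laws stay fixed and safe by construction.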
Constrained Hierarchical Monte Carlo Belief-State Planning
Optimal plans in Constrained Partially Observable Markov Decision Processes
(CPOMDPs) maximize reward objectives while satisfying hard cost constraints,
generalizing safe planning under state and transition uncertainty.
Unfortunately, online CPOMDP planning is extremely difficult in large or
continuous problem domains. In many large robotic domains, hierarchical
decomposition can simplify planning by using tools for low-level control given
high-level action primitives (options). We introduce Constrained Options Belief
Tree Search (COBeTS) to leverage this hierarchy and scale online search-based
CPOMDP planning to large robotic problems. We show that if primitive option
controllers are defined to satisfy assigned constraint budgets, then COBeTS
will satisfy constraints anytime. Otherwise, COBeTS will guide the search
towards a safe sequence of option primitives, and hierarchical monitoring can
be used to achieve runtime safety. We demonstrate COBeTS in several
safety-critical, constrained partially observable robotic domains, showing that
it can plan successfully in continuous CPOMDPs while non-hierarchical baselines
cannot.
Comment: Under review for the 2024 IEEE International Conference on Robotics and Automation (ICRA).
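The anytime-safety argument above can be sketched in a few lines: if each option (primitive controller) is assigned a share of the hard cost budget and respects it, then any sequence of options respects the overall constraint. The even budget split, the function names, and the scalar costs below are illustrative assumptions, not the COBeTS implementation.

```python
def allocate_budgets(total_budget, n_options):
    """Split a hard cost budget evenly across the planned options."""
    return [total_budget / n_options] * n_options


def option_is_admissible(option_cost, budget):
    """An option is admissible if its accumulated cost stays within its budget."""
    return option_cost <= budget


def run_plan(option_costs, total_budget):
    """Execute options in sequence, stopping if any would exceed its budget."""
    budgets = allocate_budgets(total_budget, len(option_costs))
    spent = 0.0
    for cost, budget in zip(option_costs, budgets):
        if not option_is_admissible(cost, budget):
            # Violation: hierarchical monitoring would fall back to a safe option.
            return spent, False
        spent += cost
    return spent, True


spent, safe = run_plan([0.2, 0.3, 0.1], total_budget=1.5)
```

Because each per-option budget is satisfied individually, the total cost is bounded at every step, which is the sense in which constraint satisfaction holds anytime.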
Online Planning for Constrained POMDPs with Continuous Spaces through Dual Ascent
Rather than augmenting rewards with penalties for undesired behavior, Constrained Partially Observable Markov Decision Processes (CPOMDPs) plan safely by imposing inviolable hard constraint value budgets. Previous online CPOMDP planning methods have only been applied to discrete action and observation spaces. In this work, we propose algorithms for online CPOMDP planning in continuous state, action, and observation spaces by combining dual ascent with progressive widening. We empirically compare the effectiveness of our proposed algorithms on continuous CPOMDPs that model both toy and real-world safety-critical problems. Additionally, we compare against online solvers for continuous unconstrained POMDPs that scalarize cost constraints into rewards, and highlight the limitations of the default exploration scheme.
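The dual ascent idea can be sketched as a simple loop: maintain a Lagrange multiplier on the cost constraint and update it from the gap between the estimated cost and the budget. The update rule below is the standard projected dual-ascent step; the planner stub, step size, and toy cost model are illustrative assumptions, not the paper's solver.

```python
def dual_ascent(estimate_value_and_cost, budget, steps=100, lr=1.0):
    """Tune a Lagrange multiplier so expected cost meets a hard budget."""
    lam = 0.0
    for _ in range(steps):
        # Plan with the scalarized objective reward - lam * cost. Here the
        # planner is a stub returning estimated expected value and cost
        # under the current multiplier.
        _value, cost = estimate_value_and_cost(lam)
        # Projected gradient ascent on the dual: raise lam if cost exceeds
        # the budget, lower it (but never below zero) otherwise.
        lam = max(0.0, lam + lr * (cost - budget))
    return lam


# Toy planner model: expected cost shrinks as the penalty weight grows,
# cost(lam) = 1 / (1 + lam), so the budget 0.5 is met at lam = 1.
lam_star = dual_ascent(lambda lam: (1.0 - 0.1 * lam, 1.0 / (1.0 + lam)),
                       budget=0.5)
```

In the full algorithm this outer dual update wraps an inner tree search with progressive widening, which is what makes the approach applicable to continuous action and observation spaces.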